16 research outputs found

    Knowledge representation and text mining in biomedical, healthcare, and political domains

    Knowledge representation and text mining can be employed to discover new knowledge and develop services by using the massive amounts of text gathered by modern information systems. The applied methods should take into account the domain-specific nature of knowledge. This thesis explores knowledge representation and text mining in three application domains. Biomolecular events can be described very precisely and concisely with appropriate representation schemes. Protein–protein interactions are commonly modelled in biological databases as binary relationships, whereas the complex relationships used in text mining are rich in information. The experimental results of this thesis show that complex relationships can be reduced to binary relationships and that it is possible to reconstruct complex relationships from mixtures of linguistically similar relationships. This encourages the extraction of complex relationships from the scientific literature even if binary relationships are required by the application at hand. The experimental results on cross-validation schemes for pair-input data help to understand how existing knowledge regarding dependent instances (such as those concerning protein–protein pairs) can be leveraged to improve the generalisation performance estimates of learned models. Healthcare documents and news articles contain knowledge that is more difficult to model than biomolecular events and tend to have larger vocabularies than biomedical scientific articles. This thesis describes an ontology that models patient education documents and their content in order to improve the availability and quality of such documents. The experimental results of this thesis also show that the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measures are a viable option for the automatic evaluation of textual patient record summarisation methods and that the area under the receiver operating characteristic curve can be used in large-scale sentiment analysis.
The sentiment analysis of Reuters news corpora suggests that the Western mainstream media portrays China negatively in politics-related articles but not in general, which provides new evidence to consider in the debate over the image of China in the Western media.
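
The concern with dependent instances in pair-input data can be illustrated with a small sketch. It is not the thesis's own procedure: the function name `split_pairs_by_entity`, the `test_fraction` parameter, and the three-way grouping are assumptions made for illustration. The idea shown is that a test pair sharing a protein with the training pairs is not independent of them, so test pairs are grouped by how many of their members are unseen during training.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]


def split_pairs_by_entity(
    pairs: List[Pair], test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Hold out a subset of proteins and group pairs by unseen members.

    Returns (train, test_one_new, test_both_new):
      train         - pairs with no held-out protein
      test_one_new  - pairs with exactly one held-out protein
      test_both_new - pairs where both proteins are held out
    """
    # Collect the distinct proteins and shuffle them reproducibly.
    proteins = sorted({protein for pair in pairs for protein in pair})
    rng = random.Random(seed)
    rng.shuffle(proteins)

    # Hold out at least one protein (hypothetical choice of fraction).
    n_held_out = max(1, int(len(proteins) * test_fraction))
    held_out = set(proteins[:n_held_out])

    train: List[Pair] = []
    test_one_new: List[Pair] = []
    test_both_new: List[Pair] = []
    for a, b in pairs:
        unseen = (a in held_out) + (b in held_out)
        if unseen == 0:
            train.append((a, b))
        elif unseen == 1:
            test_one_new.append((a, b))
        else:
            test_both_new.append((a, b))
    return train, test_one_new, test_both_new
```

Reporting performance separately for the two test groups, rather than over a random split of pairs, gives a rough sense of how much an estimate depends on proteins already seen in training.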

    Learning to Extract Biological Event and Relation Graphs

    Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 18-25. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

    Pain process of patients with cardiac surgery — Semantic annotation of electronic patient record data

    Aims and objectives: To describe and compare the pain process of patients with cardiac surgery through nurses’ and physicians’ documentation in the electronic patient records. Background: Postoperative pain assessment and management should be documented regularly to ensure an optimal pain care process for patients. Despite the availability of evidence-based guidelines, pain assessment and documentation remain inadequate. Design: A retrospective patient record review. Methods: The original data consisted of the electronic patient records of 26,922 patients with a diagnosed heart disease. A total of 1,818 care episodes of patients with cardiac surgery were selected from the data. We used random sampling to obtain 280 care episodes for annotation. These 280 care episodes contained 2,156 physician reports and 1,327 days of nursing notes. We developed an annotation manual and schema, and then manually conducted semantic annotation on the care episodes using the Brat annotation tool. We analysed the annotation units using thematic analysis. The Consolidated Criteria for Reporting Qualitative Research guideline was followed in reporting where appropriate for this study design. Results: We discovered expressions of six different aspects of the pain process: (a) cause, (b) situation, (c) features, (d) consequences, (e) actions and (f) outcomes. We determined that five of the aspects occurred in chronological order. The features of pain, however, existed simultaneously: they indicated the location, quality, intensity, and temporality of the pain and were present in every phase of the patient’s pain process. Cardiac and postoperative pain documentation differed in the expressions used and in the quantity and quality of descriptions. Conclusion: We could construct a comprehensive pain process of the patients with cardiac surgery from several electronic patient records.
The challenge remains how to support systematic documentation for each patient. Relevance to clinical practice: The study provides knowledge and guidance on pain process aspects that can be used to achieve effective pain assessment and more comprehensive documentation.

    Comparative analysis of five protein-protein interaction corpora

    Background: Growing interest in the application of natural language processing methods to biomedical text has led to an increasing number of corpora and methods targeting protein-protein interaction (PPI) extraction. However, there is no general consensus regarding PPI annotation and consequently resources are largely incompatible and methods are difficult to evaluate. Results: We present the first comparative evaluation of the diverse PPI corpora, performing quantitative evaluation using two separate information extraction methods as well as detailed statistical and qualitative analyses of their properties. For the evaluation, we unify the corpus PPI annotations to a shared level of information, consisting of undirected, untyped binary interactions of non-static types with no identification of the words specifying the interaction, no negations, and no interaction certainty. We find that the F-score performance of a state-of-the-art PPI extraction method varies on average 19 percentage units and in some cases over 30 percentage units between the different evaluated corpora. The differences stemming from the choice of corpus can thus be substantially larger than differences between the performance of PPI extraction methods, which suggests definite limits on the ability to compare methods evaluated on different resources. We analyse a number of potential sources for these differences and identify factors explaining approximately half of the variance. We further suggest ways in which the difficulty of the PPI extraction tasks codified by different corpora can be determined to advance comparability.
Our analysis also identifies points of agreement and disagreement in PPI corpus annotation that are rarely explicitly stated by the authors of the corpora. Conclusions: Our comparative analysis uncovers key similarities and differences between the diverse PPI corpora, thus taking an important step towards standardization. In the course of this study we have created a major practical contribution in converting the corpora into a shared format. The conversion software is freely available at http://mars.cs.utu.fi/PPICorpora.
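
The unification to undirected, untyped binary interactions and the F-score comparison can be sketched roughly as follows. This is a minimal illustration, not the authors' conversion software: the `(type, a, b)` tuple layout, the function names, and the example protein identifiers are all hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Interaction = Tuple[str, str, str]  # (interaction type, participant a, participant b)


def to_undirected_pairs(interactions: Iterable[Interaction]) -> Set[FrozenSet[str]]:
    """Discard the interaction type and the direction of each interaction."""
    return {frozenset((a, b)) for _type, a, b in interactions}


def f_score(gold: Set[FrozenSet[str]], predicted: Set[FrozenSet[str]]) -> float:
    """Balanced F-score (harmonic mean of precision and recall) over pair sets."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: types and argument order differ between sources,
# but they become comparable once reduced to unordered pairs.
gold = to_undirected_pairs([("bind", "P1", "P2"), ("phosphorylate", "P3", "P2")])
predicted = to_undirected_pairs([("interact", "P2", "P1"), ("interact", "P1", "P3")])
print(f_score(gold, predicted))
```

Using unordered sets for the pairs means that `("P1", "P2")` and `("P2", "P1")` count as the same interaction, which is what makes annotations with different direction conventions comparable at this shared level.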

    BioInfer: a corpus for information extraction in the biomedical domain

    BACKGROUND: Recently, there has been great interest in the application of information extraction methods to the biomedical domain, in particular to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods require annotated domain corpora. RESULTS: We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, and syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a single set of sentences, and in the level of detail of the relationship annotation. CONCLUSION: We introduce a corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers. The corpus will be maintained and further developed with a current version being available at

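    Organisaation inertia strategisessa julkisen sektorin yhdistymishankkeessa - Tapaus Aalto-yliopisto [Organizational inertia in a strategic public sector merger project - the case of Aalto University]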

    Aalto University is a new Finnish private university that was created on 1 January 2010 through the merger of three universities, each a national leader in its own field. The merger involved a strategic change effort across all significant activities of the organization, including the management system, organizational structure, personnel, and culture, within a short timeframe dictated by external political and economic forces. The overall rate of reforms actually implemented during a relatively short transformation period challenges the picture of universities as extremely static organizations. This study focuses on explaining the organizational phenomena observed during the transformation effort by creating a grounded framework of organizational inertia. The study illustrates the importance of organizational inertia as a concept, and its power as a framework to explain and classify complex organizational phenomena. The developed theory provides an explanation for most of the relevant phenomena encountered during the study, with an apparent focus on institutional, political, legitimacy, and information issues in this public sector case. The study illustrates the significance of cultural, institutional, and decision-making issues, which make a university organization a mess of politics, rivalry, problematic leadership, and complex social networks. The results suggest that it may be very hard to lead the people in a university, but the management can certainly make an effort to lead the image of a university by creating a compelling transformation story as well as positive examples and role models, which may be powerful drivers of change.
Furthermore, organizational inertia emerged not only as a force making the change more difficult: the inertial themes observed in the pre-merger organization also formed the rationale for the change itself, suggesting that certain forces can act simultaneously to initiate and inhibit strategic change by increasing the need for change while decreasing the ability to change.